Meta-learning beyond classification: A framework for information extraction from the Web

نویسندگان

  • Georgios Sigletos
  • Georgios Paliouras
  • Constantine D. Spyropoulos
  • Takis Stamatopoulos
چکیده

This paper proposes a meta-learning framework in the context of information extraction from the Web. The proposed framework relies on learning a meta-level classifier, based on the output of base-level information extraction systems. Such systems are typically trained to recognize relevant information within documents, i.e., streams of lexical units, which differs significantly from the task of classifying feature vectors that is commonly assumed for metalearning. The proposed framework was evaluated experimentally on the challenging task of training an information extraction system for multiple Web sites. Three well-known methods for training extraction systems were employed at the base level. A variety of classifiers were comparatively evaluated at the meta level. The extraction accuracy that was obtained demonstrated the effectiveness of the proposed framework of collaboration between base-level extraction systems and common classifiers at meta-level.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Presenting a method for extracting structured domain-dependent information from Farsi Web pages

Extracting structured information about entities from web texts is an important task in web mining, natural language processing, and information extraction. Information extraction is useful in many applications including search engines, question-answering systems, recommender systems, machine translation, etc. An information extraction system aims to identify the entities from the text and extr...

متن کامل

1 Learning Named-Entity Recognition Rules

This paper defines a new stacked generalization framework in the context of information extraction (IE) from online sources. The proposed setting removes the constraint of applying classifiers at the base-level. A set of IE systems are trained instead to identify relevant fragments within text documents, which differs significantly from the task of classifying candidate text fragments as releva...

متن کامل

Developing a New Method in Object Based Classification to Updating Large Scale Maps with Emphasis on Building Feature

According to the cities expansion, updating urban maps for urban planning is important and its effectiveness is depend on the information extraction / change detection accuracy. Information extraction methods are divided into two groups, including Pixel-Based (PB) and Object-Based (OB). OB analysis has overcome the limitations of PB analysis (producing salt-pepper results and features with hole...

متن کامل

Adaptive Information Analysis in Higher Education Institutes

Information integration plays an important role in academic environments since it provides a comprehensive view of education data and enables mangers to analyze and evaluate the effectiveness of education processes. However, the problem in the traditional information integration is the lack of personalization due to weak information resource or unavailability of analysis functionality. In this ...

متن کامل

Adaptive Information Analysis in Higher Education Institutes

Information integration plays an important role in academic environments since it provides a comprehensive view of education data and enables mangers to analyze and evaluate the effectiveness of education processes. However, the problem in the traditional information integration is the lack of personalization due to weak information resource or unavailability of analysis functionality. In this ...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2003